Multilingual Ontology Enrichment for Semantic Annotation and Retrieval of Medical Information
Abstract
Background: Knowledge management in the European project Noesis addresses concept-based annotation and multilingual Information Retrieval of documents.
Objective: To enrich a concept-based medical terminology with multilingual vocabulary, and to evaluate the approach in the domain of cardiovascular diseases by enriching a subset of the MeSH thesaurus in six European languages. This terminology, represented in the standard ontology language OWL, has been used for manual semantic annotation of medical texts, for automatic indexing, and for Information Retrieval.
Methods: A subset of 2500 cardiovascular-related concepts was extracted from the MeSH thesaurus and represented in OWL. Multilingual enrichment was done in two phases: automatically, through the UMLS Metathesaurus and its associated vocabulary, and manually, using terms extracted from a corpus of English texts in the cardiovascular domain. Manual enrichment from the English corpus was done first; it consisted of extracting candidate terms, selecting among them, and determining the concepts they represent. The vocabulary for the five other languages was obtained mostly by translation. A software environment based on ontology editing was designed to support the enrichment task.
Results: Each concept is represented on average by 3.2 terms in the initial English MeSH vocabulary. After automatic enrichment through UMLS the average rises to 7, and to 8.5 after manual enrichment from the corpus of texts. For the other languages, UMLS enrichment yields from 1 term (Italian) to 4 terms (Spanish) per concept. The environment supporting multilingual enrichment is publicly available.
Conclusions: The choice of the MeSH thesaurus was a reasonable decision in the context of the Noesis project. However, its weak structure, which is far from that of an ontology, limited the work to terminology enrichment, although the need for new concepts and a better structure was often felt.
Similar resources
Enrichment of French Biomedical Ontologies with UMLS Concepts and Semantic Types for Biomedical Named Entity Recognition Though Ontological Semantic Annotation
Medical terminologies and ontologies are a crucial resource for semantic annotation of biomedical text. In French, there are considerably less resources and tools to use them than in English. Some terminologies from the Unified Medical Language System have been translated but often the identifiers used in the UMLS Metathesaurus, that make its huge integrated value, have been lost during the pro...
Public Transport Ontology for Passenger Information Retrieval
Passenger information aims at improving the user-friendliness of public transport systems while influencing passenger route choices to satisfy transit user’s travel requirements. The integration of transit information from multiple agencies is a major challenge in implementation of multi-modal passenger information systems. The problem of information sharing is further compounded by the multi-l...
Knowledge Representation and Management. From Ontology to Annotation. Findings from the Yearbook 2015 Section on Knowledge Representation and Management.
OBJECTIVE To summarize the best papers in the field of Knowledge Representation and Management (KRM). METHODS A comprehensive review of medical informatics literature was performed to select some of the most interesting papers of KRM published in 2014. RESULTS Four articles were selected, two focused on annotation and information retrieval using an ontology. The two others focused mainly on...
Cross Document Ontology based Information Extraction for Multimedia Retrieval
This paper describes the MUMIS project, which applies ontology based Information Extraction to improve the results of Information Retrieval in multimedia archives. It makes use of a domain specific ontology, multilingual lexicons and reasoning algorithms to automatically create a semantic annotation of sources. The innovative aspect is the use of a cross document merging algorithm that combines...
Developing a BIM-based Spatial Ontology for Semantic Querying of 3D Property Information
With the growing dominance of complex and multi-level urban structures, current cadastral systems, which are often developed based on 2D representations, are not capable of providing unambiguous spatial information about urban properties. Therefore, the concept of 3D cadastre is proposed to support 3D digital representation of land and properties and facilitate the communication of legal owners...